Computing Runs on a General Alphabet

Author

  • Dmitry Kosolobov

Abstract

We describe a RAM algorithm computing all runs (maximal repetitions) of a given string of length $n$ over a general ordered alphabet in $O(n \log^{2/3} n)$ time and linear space. Our algorithm outperforms all known solutions working in $\Theta(n \log \sigma)$ time provided $\sigma = n^{\Omega(1)}$, where $\sigma$ is the number of distinct letters in the input string. We conjecture that there exists a linear time RAM algorithm finding all runs.
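To make the notion of a run concrete, here is a minimal Python sketch of a naive quadratic-time enumeration. It is not the algorithm of the paper: it only illustrates the definition, assuming the usual convention that a run is a maximal substring whose length is at least twice its smallest period. The function name `naive_runs` and the `(start, end, period)` output format are illustrative choices, not taken from the source.

```python
def naive_runs(s):
    """Return all runs of s as sorted (start, end, period) triples.

    Indices are 0-based and `end` is inclusive. Illustrative O(n^2)
    sketch only; it uses nothing but equality comparisons of letters.
    """
    n = len(s)
    seen = set()   # intervals already reported with a smaller period
    runs = []
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if s[k] != s[k + p]:
                k += 1
                continue
            a = k
            # Extend the stretch of positions k with s[k] == s[k + p].
            while k + p < n and s[k] == s[k + p]:
                k += 1
            # s[a .. k-1+p] has period p and cannot be extended either way.
            start, end = a, k - 1 + p
            # Keep it only if it contains at least two full periods and
            # was not already found with a smaller (hence smallest) period.
            if k - a >= p and (start, end) not in seen:
                seen.add((start, end))
                runs.append((start, end, p))
    return sorted(runs)


if __name__ == "__main__":
    print(naive_runs("mississippi"))
```

Periods are tried in increasing order, so an interval is recorded the first time it appears, which is with its smallest period; a later candidate with the same endpoints and a larger period is skipped.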

Related articles

On Runs in Independent Sequences

Given an i.i.d. sequence of n letters from a finite alphabet, we consider the length of the longest run of any letter. In the equiprobable case, results for this run turn out to be closely related to the well-known results for the longest run of a given letter. For coin-tossing, tail probabilities are compared for both kinds of runs via Poisson approximation.

Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries

Longest common extension queries (LCE queries) and runs are ubiquitous in algorithmic stringology. Linear-time algorithms computing runs and preprocessing for constant-time LCE queries have been known for over a decade. However, these algorithms assume a linearly-sortable integer alphabet. A recent breakthrough paper by Bannai et al. (SODA 2015) showed a link between the two notions: all the r...

A Further Note on Runs in Independent Sequences

Given a sequence of letters generated independently from a finite alphabet, we consider the case when more than one, but not all, letters are generated with the highest probability. The length of the longest run of any of these letters is shown to be one greater than the length of the longest run in a particular state of an associated Markov chain. Using results of Foulser and Karlin (19...

New Algorithms for the Longest Common Subsequence Problem

Given two sequences $A = a_1 a_2 \cdots a_m$ and $B = b_1 b_2 \cdots b_n$, $m \le n$, over some alphabet $\Sigma$, a common subsequence $C = c_1 c_2 \cdots c_l$ of $A$ and $B$ is a sequence that can be obtained from both $A$ and $B$ by deleting zero or more (not necessarily adjacent) symbols. Finding a common subsequence of maximal length is called the Longest Common Subsequence (LCS) Problem. Two new algorithms based on the wel...

Faster Longest Common Extension Queries in Strings over General Alphabets

Longest common extension queries (often called longest common prefix queries) constitute a fundamental building block in multiple string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of q LCE queries for a string of size n over a general ordered alphabet can be realized in O(q log log n + n log* n) time making only O(q + n) symbol comparisons. C...

Journal:
  • Inf. Process. Lett.

Volume 116

Publication date: 2016